End-to-End Neural Video Coding Using a Compound Spatiotemporal Representation

Authors

Abstract

Recent years have witnessed rapid advances in learnt video coding. Most algorithms have solely relied on vector-based motion representation and resampling (e.g., optical-flow-based bilinear sampling) for exploiting inter-frame redundancy. Despite the great success of adaptive kernel-based resampling (e.g., adaptive convolutions and deformable convolutions) in prediction for uncompressed videos, integrating such approaches with rate-distortion-optimized coding has been less successful. Recognizing that each solution offers unique advantages in regions with different texture characteristics, we propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by these two approaches. Specifically, we generate a compound spatiotemporal representation (CSTR) through a recurrent information aggregation (RIA) module using information from the current and multiple past frames. We further design a one-to-many decoder pipeline that generates multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, mode selection maps, and texture enhancements, and combines them adaptively to achieve more accurate prediction. Experiments show that our proposed system provides better motion-compensated prediction and is more robust to occlusions and complex motions. Together with jointly trained intra and residual coders, the overall coder yields state-of-the-art efficiency in the low-delay scenario, compared with the traditional H.264/AVC and H.265/HEVC, as well as recently published learning-based methods, in terms of both PSNR and MS-SSIM metrics.
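To make the abstract's core idea concrete, the sketch below is an illustrative, simplified reading of hybrid motion compensation in PyTorch, not the authors' implementation: it blends a vector-based (flow-warped) prediction with a kernel-based (deformable-convolution) prediction using a per-pixel mode-selection map and a texture-enhancement residual, all decoded from one feature tensor. The module name HybridMotionCompensation, the head layers, the channel sizes, and the stand-in `cstr` tensor (used here in place of the paper's recurrent information aggregation module) are assumptions.

```python
# Illustrative sketch only; layer names, channel sizes, and the `cstr` input are
# assumptions, not the published architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d


def flow_warp(frame, flow):
    """Vector-based prediction: bilinearly resample `frame` along a dense flow field.

    `flow` is (N, 2, H, W); channel 0 = horizontal, channel 1 = vertical
    displacement in pixels (an assumed convention).
    """
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=frame.device, dtype=frame.dtype),
        torch.arange(w, device=frame.device, dtype=frame.dtype),
        indexing="ij",
    )
    # Absolute sampling positions, normalized to [-1, 1] as grid_sample expects.
    gx = 2.0 * (xs + flow[:, 0]) / (w - 1) - 1.0
    gy = 2.0 * (ys + flow[:, 1]) / (h - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                                # (N, H, W, 2)
    return F.grid_sample(frame, grid, mode="bilinear", align_corners=True)


class HybridMotionCompensation(nn.Module):
    """One-to-many decoding of a compound feature tensor into flow, deformable
    offsets, a soft mode-selection map, and a texture-enhancement residual,
    followed by per-pixel blending of the two predictions."""

    def __init__(self, cstr_channels=64, kernel=3):
        super().__init__()
        self.flow_head = nn.Conv2d(cstr_channels, 2, 3, padding=1)
        self.offset_head = nn.Conv2d(cstr_channels, 2 * kernel * kernel, 3, padding=1)
        self.mode_head = nn.Conv2d(cstr_channels, 2, 3, padding=1)      # weights for the two predictions
        self.texture_head = nn.Conv2d(cstr_channels, 3, 3, padding=1)
        self.deform = DeformConv2d(3, 3, kernel, padding=1)

    def forward(self, reference, cstr):
        pred_vec = flow_warp(reference, self.flow_head(cstr))           # vector-based resampling
        pred_ker = self.deform(reference, self.offset_head(cstr))       # kernel-based resampling
        mode = torch.softmax(self.mode_head(cstr), dim=1)               # per-pixel selection map
        blended = mode[:, 0:1] * pred_vec + mode[:, 1:2] * pred_ker
        return blended + self.texture_head(cstr)                        # texture enhancement


# Hypothetical usage with random tensors standing in for real inputs:
ref = torch.rand(1, 3, 64, 64)      # previously decoded reference frame
cstr = torch.rand(1, 64, 64, 64)    # stand-in for the aggregated spatiotemporal features
pred = HybridMotionCompensation()(ref, cstr)   # (1, 3, 64, 64) motion-compensated prediction
```

Using a softmax over the mode-selection channels gives a soft, differentiable per-pixel switch between the two resampling paths, which is one plausible way to realize the adaptive combination the abstract describes.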


Related Articles

End-to-end Video-level Representation Learning for Action Recognition

From frame/clip-level feature learning to video-level representation building, deep learning methods in action recognition have developed rapidly in recent years. However, current methods suffer from the confusion caused by partial-observation training, the lack of end-to-end learning, or the restriction to single-temporal-scale modeling. In this paper, we build upon two-stream ConvN...


End-to-End Learning of Motion Representation for Video Understanding

Despite the recent success of end-to-end learned representations, hand-crafted optical flow features are still widely used in video analysis tasks. To fill this gap, we propose TVNet, a novel end-to-end trainable neural network, to learn optical-flow-like features from data. TVNet subsumes a specific optical flow solver, the TV-L1 method, and is initialized by unfolding its optimization iterati...


End-to-End Optimized Speech Coding with Deep Neural Networks

Modern compression algorithms are often the result of laborious domain-specific research; industry standards such as MP3, JPEG, and AMR-WB took years to develop and were largely hand-designed. We present a deep neural network model which optimizes all the steps of a wideband speech coding pipeline (compression, quantization, entropy coding, and decompression) end-to-end directly from raw speech...


End-to-end esophagojejunostomy versus standard end-to-side esophagojejunostomy: which one is preferable?

Background: End-to-side esophagojejunostomy has almost always been associated with some degree of dysphagia. To overcome this complication, we decided to perform an end-to-end anastomosis and compare it with end-to-side Roux-en-Y esophagojejunostomy. Methods: In this prospective study, between 1998 and 2005, 71 patients with a diagnosis of gastric adenocarcinoma underwent total gastrec...


End-to-End Navigation in Unknown Environments using Neural Networks

We investigate how a neural network can learn perception-action loops for navigation in unknown environments. Specifically, we consider how to learn to navigate in environments populated with cul-de-sacs that represent convex local minima that the robot could fall into instead of finding a set of feasible actions that take it to the goal. Traditional methods rely on maintaining a global map t...



Journal

Journal title: IEEE Transactions on Circuits and Systems for Video Technology

Year: 2022

ISSN: 1051-8215, 1558-2205

DOI: https://doi.org/10.1109/tcsvt.2022.3150014